Two-stage Automatic Speech Summarization by Sentence Extraction and Compaction
نویسندگان
چکیده
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted from the results of large-vocabulary continuous speech recognition (LVCSR) based on the amount of information and the confidence measures of constituent words. The set of extracted sentences is compressed by our sentence compaction method. Sentence compaction is performed by selecting a word set that maximizes a summarization score which comprises the amount of information and the confidence measure of each word, the linguistic likelihood of word strings, and the word concatenation probability. The selected words are concatenated to create a summary. Effectiveness of the proposed method was confirmed by testing summarization of spontaneous presentations. Optimal ratio of sentence extraction to sentence compaction changes according to the target summarization ratio and features of presentations.
منابع مشابه
Automatic speech summarization based on sentence extraction and compaction
This paper proposes a new automatic speech summarization method having two stages: important sentence extraction and sentence compaction. Relatively important sentences are extracted based on the amount of information and the confidence measures of constituent words, and the set of extracted sentences is compressed by our sentence compaction method. The sentence compaction is performed by selec...
متن کاملSpeech Summarization: An Approach through Word Extraction and a Method for Evaluation
In this paper, we propose a new method of automatic speech summarization for each utterance, where a set of words that maximizes a summarization score is extracted from automatic speech transcriptions. The summarization score indicates the appropriateness of summarized sentences. This extraction is achieved by using a dynamic programming technique according to a target summarization ratio. This...
متن کاملمقایسه روشهای مختلف یادگیری ماشین در خلاصهسازی استخراجی گفتار به گفتار فارسی بدون استفاده از رونوشت
In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of Speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files easier and in a less costly manner. In this paper, a new method for speech summarization without using automatic speech recognitio...
متن کاملA Study on Statistical Methods for Automatic Speech Summarization
This dissertation proposes a new automatic speech summarization method through word extraction. In this method, a set of words maximizing a summarization score indicating an appropriateness of summarization is extracted from automatically transcribed speech. This extraction is performed according to a target compression ratio using a dynamic programming technique sentence by sentence. The extra...
متن کاملIntonational phrases for speech summarization
Extractive speech summarization approaches select relevant segments of spoken documents and concatenate them to generate a summary. The extraction unit chosen, whether a sentence, syntactic constituent, or other segment, has a significant impact on the overall quality and fluency of the summary. Even though sentences tend to be the choice of most the extractive speech summarizers, in this paper...
متن کامل